Overlapping clustering based on kernel similarity metric
نویسندگان
چکیده
Producing overlapping schemes is a major issue in clustering. Recent proposed overlapping methods relies on the search of an optimal covering and are based on different metrics, such as Euclidean distance and I-Divergence, used to measure closeness between observations. In this paper, we propose the use of another measure for overlapping clustering based on a kernel similarity metric .We also estimate the number of overlapped clusters using the Gram matrix. Experiments on both Iris and EachMovie datasets show the correctness of the estimation of number of clusters and show that measure based on kernel similarity metric improves the precision, recall and f-measure in overlapping clustering.
منابع مشابه
Composite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
متن کاملیادگیری نیمه نظارتی کرنل مرکب با استفاده از تکنیکهای یادگیری معیار فاصله
Distance metric has a key role in many machine learning and computer vision algorithms so that choosing an appropriate distance metric has a direct effect on the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...
متن کاملClustering with Adaptive Structure Learning: A Kernel Approach
Many similarity-based clustering methods work in two separate steps including similarity matrix computation and subsequent spectral clustering. However, similarity measurement is challenging because it is usually impacted by many factors, e.g., the choice of similarity metric, neighborhood size, scale of data, noise and outliers. Thus the learned similarity matrix is often not suitable, let alo...
متن کاملEuler Clustering
By always mapping data from lower dimensional space into higher or even infinite dimensional space, kernel k-means is able to organize data into groups when data of different clusters are not linearly separable. However, kernel k-means incurs the large scale computation due to the representation theorem, i.e. keeping an extremely large kernel matrix in memory when using popular Gaussian and spa...
متن کاملHierarchical Overlapping Clustering of Network Data Using Cut Metrics
A novel method to obtain hierarchical and overlapping clusters from network data – i.e., a set of nodes endowed with pairwise dissimilarities – is presented. The introduced method is hierarchical in the sense that it outputs a nested collection of groupings of the node set depending on the resolution or degree of similarity desired, and it is overlapping since it allows nodes to belong to more ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1211.6859 شماره
صفحات -
تاریخ انتشار 2010